ABSTRACT

This paper presents Cologne, a declarative optimization platform 
that enables constraint optimization problems (COPs) to be declar- 
atively specified and incrementally executed in distributed systems. 
Cologne integrates a declarative networking engine with an off-the- 
shelf constraint solver. We have developed the Colog language that 
combines distributed Datalog used in declarative networking with 
language constructs for specifying goals and constraints used in 
COPs. Cologne uses novel query processing strategies for process- 
ing Colog programs, by combining the use of bottom-up distributed 
Datalog evaluation with top-down goal-oriented constraint solving. 
Using case studies based on cloud and wireless network optimizations,
we demonstrate that Cologne (1) can flexibly support a wide
range of policy-based optimizations in distributed systems, (2) re- 
sults in orders of magnitude less code compared to imperative im- 
plementations, and (3) is highly efficient with low overhead and 
fast convergence times. 

1. INTRODUCTION 

In distributed systems management, operators often have to configure
system parameters that optimize performance objectives
given constraints in the deployment environment. For instance, in 
distributed data centers, cloud operators need to optimize place- 
ment of virtual machines (VMs) and storage resources to meet cus- 
tomer service level agreements (SLAs) while keeping operational 
costs low. In a completely different scenario of a wireless mesh 
network, each wireless device needs to configure its selected chan- 
nel for communication in order to ensure good network throughput 
and minimize data losses. 

This paper presents Cologne (Constraint LOGic EngiNE), a
declarative optimization platform that enables constraint optimization
problems (COPs) to be declaratively specified and incrementally
executed in distributed systems. Traditional approaches to
implementing COPs use imperative languages like C++ [2] or Java [1].
This often results in thousands of lines of code that are difficult to
maintain and customize. Moreover, due to scalability issues and 
management requirements imposed across administrative domains, 
it is often necessary to execute a COP in a distributed setting, where 
multiple local solvers coordinate with each other, each handling
a portion of the whole problem, to together achieve a global
objective.

The paper makes the following contributions: 

• Declarative platform. Central to our optimization platform is 
the integration of a declarative networking [19] engine with an 
off-the-shelf constraint solver [2]. We have developed the Colog 
language that combines distributed Datalog used in declarative 
networking with language constructs for specifying goals and 
constraints used in COPs. 

• Distributed constraint optimizations. To execute Colog pro- 
grams in a distributed setting, Cologne integrates Gecode [2], an 
off-the-shelf constraint solver, and the RapidNet declarative net- 
working engine [5] for communicating policy decisions among 
different solver nodes. Supporting distributed COP operations 
requires novel extensions to state of the art in distributed Data- 
log processing, which is primarily designed for bottom-up evalu- 
ation. One of the interesting aspects of Colog, from a query pro- 
cessing standpoint, is the integration of RapidNet (an incremen- 
tal bottom-up distributed Datalog evaluation engine) and Gecode 
(a top-down goal-oriented constraint solver). This integration al- 
lows us to implement a distributed solver that can perform in- 
cremental and distributed constraint optimizations - achieved 
through the combination of bottom-up incremental evaluation 
and top-down constraint optimizations. Our integration is 
achieved without having to modify RapidNet or Gecode, hence 
making our techniques generic and applicable to any distributed 
Datalog engine and constraint solver.

• Use cases. We have applied our platform to two representative 
use cases that allow us to showcase key features of Cologne. 
First, in automated cloud resource orchestration [16], we use 
our optimization framework to declaratively control the creation, 
management, manipulation and decommissioning of cloud resources,
in order to realize customer requests, while conforming
to operational objectives of the cloud service providers at the
same time. Second, in mesh networks, policies on wireless chan- 
nel selection [14] are declaratively specified and optimized, in 
order to reduce network interference and maximize throughput, 
while not violating constraints such as refraining from channels 
owned exclusively by the primary users. Beyond these two use 
cases, we envision that our platform has a wide range of potential
applications, for example, optimizing distributed systems for load
balancing, robust routing, scheduling, and security. 

• Evaluation. We have developed a prototype of Cologne and 
have performed extensive evaluations of our above use cases. 
Our evaluation demonstrates that Cologne (1) can flexibly support
a wide range of policy-based optimizations in distributed
systems, (2) results in orders of magnitude less code compared
to imperative implementations, and (3) is highly efficient with 
low overhead and fast convergence times. 

The rest of the paper is organized as follows. Section 2 presents 
an architecture overview of Cologne. Section 3 describes our two 
main use cases that are used as driving examples throughout the 
paper. Section 4 presents the Colog language and its execution 
model. Section 5 next describes how Colog programs are compiled 
into distributed execution plans. Section 6 presents our evaluation 
results. We then discuss related work in Section 7 and conclude in 
Section 8. 



2. SYSTEM OVERVIEW 
Figure 1: Cologne overview in distributed mode.

Figure 1 presents a system overview of Cologne, which is designed
for a distributed environment comprising several networked
nodes. Cologne can be deployed in a centralized or distributed
mode:

In the centralized deployment mode, the entire distributed sys- 
tem is configured by one centralized Cologne instance. It takes as 
input system states gathered from all nodes in the network, and a 
set of policy constraints and goals specified using the Colog declarative
language. These specifications are then used by a constraint
solver to automatically generate optimization commands. These 
commands are then input into each node's configuration layer, to
generate physical operations to directly manipulate resources at 
each node. 

In the distributed deployment mode, there are multiple Cologne 
instances, typically one for each node. In a general setting, each 
node has a set of neighbor nodes that it can directly communicate 
with (either via wireless communication links, dedicated backbone 
networks, or the Internet). A distributed query engine [5] is used to 
coordinate the exchange of system states and optimization output 
amongst Cologne instances, in order to achieve a global objective 
(this typically results in an approximate solution). 

A distributed deployment brings two advantages. First, 
distributed environments like federated cloud [9] may be adminis- 
tered by different cloud providers. This necessitates each provider 
running its own Cologne instance for its internal configuration, but 
coordinating with other Cologne instances for inter-data center
configurations. Second, even if the entire distributed system is
under one administrative domain, for scalability reasons of constraint
optimization, each node may choose to configure a smaller
set of resources using local optimization commands. 

The configuration layer is specific to each use case (Section 3).
For instance, in a cloud environment, each node can represent
a data center's resource controller. Hence, the configuration
layer is a cloud orchestration engine [17]. On the other hand, in a 
wireless mesh network setting, each node denotes a wireless node, 
and the configuration layer may refer to a node's routing [15] or 
channel configuration layer [14]. 

3. USE CASE EXAMPLES 

We present two use cases of Cologne, based on cloud resource 
orchestration [16, 17] and wireless network configuration [14]. The 
two cases differ vastly in their deployment scenarios, and hence
are useful for demonstrating the wide applicability of Cologne. We
will primarily frame our discussions of the use cases in terms of 
COPs expressed mathematically, and defer the declarative language
specifications and runtime support for realizing these COP computations
to later sections.

3.1 Cloud Resource Orchestration 

Our first use case is based on cloud resource orchestration [16], 
which involves the creation, management, manipulation and decommissioning
of cloud resources, including compute, storage and
network devices, in order to realize customer SLAs, while conforming
to operational objectives of the cloud service providers at
the same time. 

Cologne allows cloud providers to formally model cloud resources 
and formulate orchestration decisions as a COP given goals and 
constraints. Based on Figure 1, the distributed system consists of 
a network of cloud controllers (nodes), each of which runs a cloud 
resource orchestration engine [17] as its configuration layer, co- 
ordinating resources across multiple distributed data centers. At 
each node, each Cologne engine utilizes a constraint solver for ef- 
ficiently generating the set of orchestration commands, and a dis- 
tributed query engine for communicating policy decisions among 
different Cologne instances. 

Cologne provides a unified framework for mathematically modeling
cloud resource orchestration as a COP. Operational objectives
and customer SLAs are specified in terms of goals, which are
subject to a number of constraints specific to the cloud deployment
scenario. These specifications are then fed to Cologne, which
automatically synthesizes orchestration commands. 

We use the following two scenarios (ACloud and Follow-the-Sun)
as our driving examples throughout the paper. Both examples
are representative of cloud resource orchestration scenarios within 
and across data centers, respectively. 

3.1.1 ACloud (Adaptive Cloud)

In ACloud, a customer may spawn new VMs from an existing 
disk image, and later start, shutdown, or delete the VMs. In today's 
deployment, providers typically perform load balancing in an ad-hoc
fashion. For instance, VM migrations can be triggered at an
overloaded host machine, whose VMs are migrated to a randomly
chosen machine that currently has a light load. While such ad-hoc
approaches may work for a specific scenario, they are unlikely to
result in configurations that can be easily customized as policy
constraints and goals change, and their optimality cannot be easily
quantified.

As an alternative, Cologne takes as input real-time system states 
(e.g. CPU and memory load, migration feasibility), and a set of 
policies specified by the cloud provider. An example optimization 
goal is to reduce the cluster-wide CPU load variance across all host 
machines, so as to avoid hot-spots. Constraints can be tied to each
machine's resource availability (e.g. each machine can only run up 
to a fixed number of VMs, run certain classes of VMs, and not 
exceed its physical memory limit), or security concerns (VMs can 
only be migrated across certain types of hosts).

Another possible policy is to minimize the total number of VM 
migrations, as long as a load variance threshold is met across all 
hosts. Alternatively, to consolidate workloads one can minimize 
the number of machines that are hosting VMs, as long as each ap- 
plication receives sufficient resources to meet customer demands. 
Given these optimization goals and constraints, Cologne can be ex- 
ecuted periodically, triggered whenever imbalance is observed, or 
whenever VM CPU and memory usage changes. 

3.1.2 Follow-the-Sun 

Our second motivating example is based on the Follow-the-Sun 
scenario [26], which aims to migrate VMs across geographically distributed
data centers based on customer dynamics. Here, the geographic
location of the primary workload (i.e. majority of customers
using the cloud service) drives demand shifts during the
course of a day, and it is beneficial for these workload drivers to be 
in close proximity to the resources they operate on. The migration 
decision process has to occur in real-time on a live deployment with 
minimal disruption to existing services. 

In this scenario, the workload migration service aims to optimize 
for two parties: for providers, it enables service consolidation to 
reduce operating costs, and for customers, it improves application 
performance while ensuring that customer SLAs of web services 
(e.g. defined in terms of the average end-to-end experienced latency 
of user requests) are met. In addition, it may be performed to reduce 
inter-data center communication overhead [30, 7]. Since data cen- 
ters in this scenario may belong to cloud providers in different ad- 
ministrative domains (similar to federated cloud [9]), Follow-the- 
Sun may be best suited for a distributed deployment, where each 
Cologne instance is responsible for controlling resources within
its data center.

We present a COP-based mathematical model of the Follow-the-Sun
scenario. In this model, there are $n$ autonomous geographically
distributed data centers $C_1, \ldots, C_n$ at locations $1, 2, \ldots, n$. Each data
center is managed by one Cologne instance. Each site $C_i$ has a resource
capacity (set to the maximum number of VMs) denoted as
$R_i$. Each customer specifies the number of VMs to be instantiated,
as well as a preferred geographic location. We denote the aggregated
resource demand at location $j$ as $D_j$, which is the sum of
the total number of VMs demanded by all customers at that location.
Given the resource capacity and demand, $C_i$ currently allocates $A_{ji}$
resources (VMs) to meet customer demand $D_j$ at location $j$.

In the formulation, $M_{ijk}$ denotes the number of VMs migrated
from $C_i$ to $C_j$ to meet $D_k$. Migration is feasible only if there is a
link $L_{ij}$ between $C_i$ and $C_j$. When $M_{ijk} > 0$, the cloud orchestration
layer will issue commands to migrate VMs accordingly. This
can be periodically executed, or executed on demand whenever system
parameters (e.g. demand $D$ or resource availability $R$) change
drastically.

A naive algorithm is to always migrate VMs to customers' pre- 
ferred locations. However, it could be either impossible, when the 
aggregated resource demand exceeds resource capacity, or subop- 
timal, when the operating cost of a designated data center is much 
more expensive than neighboring ones, or when VM migrations 
incur enormous migration cost. 

In contrast, Cologne's COP approach attempts to optimize based 
on a number of factors captured in the cost function. In the model, 
we consider three main kinds of cost: (1) the operating cost of data center
$C_j$ is defined as $OC_j$, which includes typical recurring costs of
operating a VM at $C_j$; (2) the communication cost of meeting resource
demand $D_i$ from data center $C_j$ is given as $CC_{ij}$; (3) the migration
cost $MC_{ij}$ is the communication overhead of moving a VM from
$C_i$ to $C_j$. Given the above variables, the COP formulation is:

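$$\min\ (aggOC + aggCC + aggMC) \tag{1}$$

$$aggOC = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} \big( A_{ij} + \sum_{k=1}^{n} M_{kji} \big) \cdot OC_j \Big) \tag{2}$$

$$aggCC = \sum_{j=1}^{n} \sum_{i=1}^{n} \Big( \big( A_{ij} + \sum_{k=1}^{n} M_{kji} \big) \cdot CC_{ij} \Big) \tag{3}$$

$$aggMC = \sum_{i=1}^{n} \sum_{j=1}^{n} \Big( \big( \sum_{k=1}^{n} \max(M_{ijk}, 0) \big) \cdot MC_{ij} \Big) \tag{4}$$

subject to:

$$\forall j:\ R_j \ge \sum_{i=1}^{n} \big( A_{ij} + \sum_{k=1}^{n} M_{kji} \big) \tag{5}$$

$$\forall i, j, k:\ M_{ijk} + M_{jik} = 0 \tag{6}$$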

Optimization goal. The COP aims to minimize the aggregate
cost of cloud providers. In the above formulation, it is defined as
the sum of the aggregate operating cost $aggOC$ in (2) across all
data centers, the aggregate communication cost $aggCC$ in (3) to
meet customer demands served at various data centers, and the aggregate
VM migration cost $aggMC$ in (4), all of which are computed
by summing up $OC_j$, $CC_{ij}$, and $MC_{ij}$ for the entire system.

Constraints. The COP is subject to two representative constraints.
In Constraint (5), each data center cannot allocate more
resources than it possesses. Constraint (6) ensures the zero-sum
relation between migrated VMs between $C_i$ and $C_j$ for demand $k$.
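
As a reading aid, the formulation in (1)-(6) can be evaluated directly for a small instance. The Python sketch below is not part of Cologne; the function names `follow_the_sun_cost` and `is_feasible` and the nested-list encoding of $A$, $M$, $OC$, $CC$, $MC$ and $R$ are assumptions made purely for illustration.

```python
from itertools import product


def follow_the_sun_cost(A, M, OC, CC, MC):
    """Evaluate objective (1) = aggOC + aggCC + aggMC.

    Hypothetical encoding (not from the paper):
      A[i][j]    : VMs at data center j currently serving demand i
      M[i][j][k] : VMs migrated from center i to center j to meet demand k
      OC[j]      : operating cost per VM at center j
      CC[i][j]   : communication cost of serving demand i from center j
      MC[i][j]   : cost of moving one VM from center i to center j
    """
    idx = range(len(OC))
    # Allocation after migration: A_ij + sum_k M_kji
    nxt = [[A[i][j] + sum(M[k][j][i] for k in idx) for j in idx] for i in idx]
    agg_oc = sum(nxt[i][j] * OC[j] for i in idx for j in idx)         # (2)
    agg_cc = sum(nxt[i][j] * CC[i][j] for i in idx for j in idx)      # (3)
    agg_mc = sum(max(M[i][j][k], 0) * MC[i][j]                        # (4)
                 for i, j, k in product(idx, repeat=3))
    return agg_oc + agg_cc + agg_mc


def is_feasible(A, M, R):
    """Check capacity constraint (5) and zero-sum constraint (6)."""
    idx = range(len(R))
    capacity_ok = all(
        R[j] >= sum(A[i][j] + sum(M[k][j][i] for k in idx) for i in idx)
        for j in idx
    )
    zero_sum = all(M[i][j][k] + M[j][i][k] == 0
                   for i, j, k in product(idx, repeat=3))
    return capacity_ok and zero_sum
```

For a two-site instance with `A = [[4, 0], [0, 3]]`, `OC = [10, 1]`, `CC = [[0, 2], [2, 0]]` and `MC = [[0, 1], [1, 0]]`, the sketch yields a cost of 43 with no migration and 31 after moving two VMs of demand 0 from site 0 to site 1 (encoded as `M[0][1][0] = 2` and `M[1][0][0] = -2`).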

3.2 Wireless Network Configuration 

Our second use case is based on optimizing wireless networks by 
adjusting the selected channels used by wireless nodes to commu- 
nicate with one another [14]. In wireless networks, communication 
between two adjacent nodes (within close radio range) would re- 
sult in possible interference. As a result, a popular optimization 
strategy performed is to carefully configure channel selection and 
routing policies in wireless mesh networks [11, 12]. These proposals
aim to mitigate the impact of harmful interference and thus
improve overall network performance. For reasonable operation of 
large wireless mesh networks with nodes strewn over a wide area 
with heterogeneous policy constraints and traffic characteristics, a 
one-size-fits-all channel selection and routing protocol may be dif- 
ficult, if not impossible, to find. 

To address the above needs, Cologne serves as a basis for de- 
veloping intelligent network protocols that simultaneously control 
parameters for dynamic (or agile) spectrum sensing and access, dy- 
namic channel selection and medium access, and data routing with 
a goal of optimizing overall network performance. 

In Cologne, channel selection policies are formulated as COPs 
that are specified using Colog. The customizability of Colog al- 
lows providers a great degree of flexibility in the specification and 
enforcement of various local and global channel selection policies. 
These policy specifications are then compiled into efficient con- 
straint solver code for execution. Colog can be used to express 
both centralized and distributed channel selection protocols. 

Appendix A.1 gives examples of wireless channel selection expressed
as mathematical COP formulations.




4. COLOG LANGUAGE 

Cologne uses a declarative policy language Colog to concisely 
specify the COP formulation in the form of policy goals and constraints.
Using as examples ACloud and Follow-the-Sun from Section 3.1,
we present the Colog language and briefly describe its
execution model. Additional examples involving wireless network 
configurations are presented in Appendix A.2 and A.3. 

Colog is based on Datalog, a recursive query language used in 
the database community for querying graphs. Our choice of Datalog
as a basis for Colog is driven by Datalog's conciseness in specifying
dependencies among system states, including distributed system
states that exhibit recursive properties. Its root in logic provides
a convenient mechanism for expressing solver goals and constraints.
Moreover, there exist distributed Datalog engines [5] that
will later facilitate distributed COP computations. In the rest of this 
section, we first introduce centralized Colog (without constructs for 
distribution), followed by distributed Colog. 

4.1 Datalog Conventions 

In our paper, we use the Datalog conventions in [22] in presenting
Colog. A Datalog program consists of a set of declarative rules.
Each rule has the form p <- q1, q2, ..., qn., which can be read
informally as "q1 and q2 and ... and qn implies p". Here, p is the
head of the rule, and q1, q2, ..., qn is a list of literals that constitutes
the body of the rule. Literals are either predicates with attributes, or 
boolean expressions that involve function symbols (including arith- 
metic) applied to attributes. The predicates in traditional Datalog 
rules are relations, and we will refer to them interchangeably as 
predicates, relations, or tables. 

Datalog rules can refer to one another in a mutually recursive 
fashion. The order in which the rules are presented in a program 
is semantically immaterial; likewise, the order predicates appear in 
a rule is not semantically meaningful. Commas are interpreted as 
logical conjunctions (AND). Conventionally, the names of predi- 
cates, function symbols, and constants begin with a lowercase let- 
ter, while attribute names begin with an uppercase letter. Func- 
tion calls are additionally prepended by f_. Aggregate constructs 
(e.g. SUM, MIN, MAX) are represented as functions with attributes 
within angle brackets (<>). 
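
To illustrate how recursive rules of this form are evaluated bottom-up, the Python sketch below computes the fixpoint of a two-rule reachability program. The `link` and `path` predicates and the function name `reachable` are illustrative assumptions, and the naive iteration shown here is a textbook strategy rather than the incremental evaluation performed by a declarative networking engine.

```python
def reachable(links):
    """Naive bottom-up evaluation of the recursive Datalog program:

        path(X,Y) <- link(X,Y).
        path(X,Z) <- link(X,Y), path(Y,Z).

    `links` is a set of (source, destination) facts.
    """
    path = set(links)                      # first rule: every link is a path
    while True:
        # second rule: join link(X,Y) with path(Y,Z) on the shared attribute Y
        derived = {(x, z) for (x, y) in links for (y2, z) in path if y == y2}
        if derived <= path:                # fixpoint: no new facts derivable
            return path
        path |= derived
```

Rules are applied repeatedly until no new facts are derived, so the result does not depend on the order in which rules or body predicates are written.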

4.2 Centralized Colog 

Colog extends traditional Datalog with constructs for expressing 
goals and constraints and also distributed computations. We defer 
the discussion of distribution to Section 4.3, and primarily focus on 
centralized Colog here. 

Colog specifications are compiled into execution plans executed 
by a Datalog evaluation engine that includes modules for constraint 
solving. In a Colog program, two reserved keywords goal and var
specify the goal and variables used by the constraint solver. The
type of goal is either minimize, maximize or satisfy. As their names
suggest, the first two minimize or maximize a given objective,
and the third one means to find a solution that satisfies all given
constraints.

Colog has two types of table attributes - regular and solver. A 
regular attribute is a conventional Datalog table attribute, while a 
solver attribute is either a constraint solver variable or is derived 
from existing ones. The difference between the two is that the 
actual value of a regular attribute is determined by facts within a 
database, e.g. it could be an integer, a string, or an IP address. On 
the other hand, the value of a solver attribute is only determined by 
the constraint solver after executing its optimization modules. 

We refer to tables that contain solver attributes as solver tables.
Tables that contain only regular attributes are referred to as regular
tables, which are essentially traditional Datalog base and derived
tables.

Given the above table types, Colog includes traditional Datalog 
rules that only contain regular tables, and solver rules that contain 
one or more solver tables. These solver rules can further be catego- 
rized as derivation or constraint rules: 

• A solver derivation rule derives intermediate solver variables
based on existing ones. Like Datalog rules, these rules have the
form p <- q1, q2, ..., qn., which results in the derivation of
p whenever the rule body (q1 and q2 and ... and qn) is true.
Unlike regular Datalog rules, the rule head p is a solver table.

• A solver constraint rule has the form p -> q1, q2, ..., qn.,
denoting the logical meaning that whenever the rule head p is
true, the rule body (q1 and q2 and ... and qn) must also be true
to satisfy the constraint. In Cologne, all constraint rules involve 
one or more solver tables in either the rule body or head. Unlike a 
solver derivation rule, which derives new variables, a constraint 
restricts a solver attribute's allowed values, hence representing 
an invariant that must be maintained at all times. Constraints are 
used by the solver to limit the search space when computing the 
optimization goal. 

A compiler can statically analyze a Colog program to determine
whether each rule is a Datalog rule, or a solver derivation/constraint rule.
For ease of exposition in the paper, we add a rule label prefix r,
d, and c to regular Datalog, solver derivation, and solver constraint
rules respectively.

As an example, the following program expresses a COP that aims 
to achieve load-balancing within a data center for the ACloud re- 
source orchestration scenario in Section 3.1. This example is cen- 
tralized, and we will revisit the distributed extensions in the next 
section. 

goal minimize C in hostStdevCpu(C).
var assign(Vid,Hid,V) forall toAssign(Vid,Hid).

r1 toAssign(Vid,Hid) <- vm(Vid,Cpu,Mem),
       host(Hid,Cpu2,Mem2).
d1 hostCpu(Hid,SUM<C>) <- assign(Vid,Hid,V),
       vm(Vid,Cpu,Mem), C==V*Cpu.
d2 hostStdevCpu(STDEV<C>) <- host(Hid,Cpu,Mem),
       hostCpu(Hid,Cpu2), C==Cpu+Cpu2.
d3 assignCount(Vid,SUM<V>) <- assign(Vid,Hid,V).
c1 assignCount(Vid,V) -> V==1.
d4 hostMem(Hid,SUM<M>) <- assign(Vid,Hid,V),
       vm(Vid,Cpu,Mem), M==V*Mem.
c2 hostMem(Hid,Mem) -> hostMemThres(Hid,M), Mem<=M.

Program description. The above program takes as input
vm(Vid,Cpu,Mem) and host(Hid,Cpu,Mem) tables, which are regular
tables. Each vm entry stores information of a VM uniquely
identified by Vid. Additional monitored information (i.e. its CPU
utilization Cpu and memory usage Mem) is also supplied in each
entry. This monitored information can be provided by the cloud infrastructure,
which regularly updates CPU and memory attributes
in the vm table. The host table stores the hosts' CPU utilization
Cpu and memory usage Mem. Given these input tables, the above
program expresses the following:

• Optimization goal: Minimize the CPU standard deviation at- 
tribute C in hostStdevCpu. 

• Variables: As output, the solver generates assign(Vid,Hid,V)
entries, where V are solver variables and each entry indicates that VM
Vid is assigned to host Hid if V is 1 (otherwise 0).
assign(Vid,Hid,V) is bounded via the keyword forall to the
toAssign table, generated by joining vm with host in rule r1.




• Solver derivations: Rule d1 aggregates the CPU of all VMs
running on each host. Rule d2 takes the output from d1 and then
computes the system-wide standard deviation of the aggregate
CPU load across all hosts. The output from d2 is later used by
the constraint solver for exploring the search space that meets 
the optimization goal. In most (if not all) Colog programs, the 
final optimization goal is derived from (or dependent on) solver 
variables. 

• Solver constraints: Constraint c1 expresses that each VM is assigned
to one and only one host, via first aggregating the number
of VM assignments in rule d3. Similarly, constraint c2 ensures
that no host can accommodate VMs whose aggregate memory
exceeds its physical limit, as defined in hostMemThres.
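
The meaning of the program above can be made concrete with a small brute-force sketch in Python. This is only an illustration of the COP's semantics under stated assumptions: the function name `solve_acloud` and the dictionary inputs are hypothetical, exhaustive enumeration stands in for the constraint solver's search, and STDEV is assumed to be the population standard deviation.

```python
from itertools import product
from statistics import pstdev


def solve_acloud(vms, hosts, host_mem_thres):
    """Exhaustively search VM placements for the ACloud COP.

    vms:            {vid: (cpu, mem)}   mirrors vm(Vid,Cpu,Mem)
    hosts:          {hid: (cpu, mem)}   mirrors host(Hid,Cpu,Mem)
    host_mem_thres: {hid: max_mem}      mirrors hostMemThres(Hid,M)
    Returns (stdev, {vid: hid}) or None if no placement is feasible.
    """
    vids, hids = sorted(vms), sorted(hosts)
    best = None
    # Picking exactly one host per VM enforces constraint c1 by construction.
    for placement in product(hids, repeat=len(vids)):
        host_cpu = {h: 0 for h in hids}      # rule d1: SUM of V*Cpu per host
        host_mem = {h: 0 for h in hids}      # rule d4: SUM of V*Mem per host
        for vid, hid in zip(vids, placement):
            host_cpu[hid] += vms[vid][0]
            host_mem[hid] += vms[vid][1]
        # constraint c2: aggregate VM memory must not exceed the threshold
        if any(host_mem[h] > host_mem_thres[h] for h in hids):
            continue
        # rule d2: standard deviation of (host CPU + assigned VM CPU)
        stdev = pstdev([hosts[h][0] + host_cpu[h] for h in hids])
        if best is None or stdev < best[0]:  # goal: minimize C
            best = (stdev, dict(zip(vids, placement)))
    return best
```

Enumeration is exponential in the number of VMs, which is one reason the text bounds solving time and relies on a dedicated solver rather than a search of this kind.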

To invoke actual constraint solving, Colog uses a reserved event
invokeSolver to trigger the optimization computation. This event
can be generated either periodically, or triggered based on an event
(local table updates or network messages). To restrict the maximum
solving time for each COP execution, one can set the parameter
SOLVER_MAX_TIME.

Using Colog, it is easy to customize policies simply by modify- 
ing the goals, constraints, and adding additional derivation rules. 
For instance, we can add a rule (continuous query) that triggers 
the COP program whenever load imbalance is observed (i.e. C in
hostStdevCpu exceeds a threshold). Alternatively, we can optimize
for the fewest number of unique hosts used for migration
while meeting customer SLAs when consolidating workloads. If 
the overhead of VM migration is considered too high, we can limit 
the number of VM migrations, as demonstrated by the rules below. 

d5 migrate(Vid,Hid1,Hid2,C) <- assign(Vid,Hid1,V),
       origin(Vid,Hid2), Hid1!=Hid2, (V==1)==(C==1).
d6 migrateCount(SUM<C>) <- migrate(Vid,Hid1,Hid2,C).
c3 migrateCount(C) -> C<=max_migrates.

In rule d5, the origin table records the current VM-to-host mappings,
i.e. VM Vid is running on host Hid. Derivation rules d5-6
count how many VMs are to be migrated after optimization. In
d5, (V==1)==(C==1) means that if V is 1 (i.e. VM Vid is assigned
to host Hid1), then C is 1 (i.e. migrate VM Vid from host Hid2 to
Hid1). Otherwise, C is not 1. Constraint rule c3 guarantees that the
total number of migrations does not exceed a pre-defined threshold
max_migrates.
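
The effect of rules d5, d6 and c3 can be sketched in Python as follows. The function names and the dictionary encoding of the assign and origin tables are hypothetical, and the sketch assumes that C only takes the values 0 and 1, which the rules above do not state explicitly.

```python
def count_migrations(assign, origin):
    """Count VM migrations implied by an assignment (rules d5 and d6).

    assign: {(vid, hid): 0 or 1}   mirrors assign(Vid,Hid,V)
    origin: {vid: hid}             mirrors origin(Vid,Hid)
    """
    # d5: for each candidate host Hid1 that differs from the VM's current
    # host Hid2, C is 1 exactly when V is 1 (assumed 0 otherwise).
    migrate = {
        (vid, hid1, origin[vid]): int(v == 1)
        for (vid, hid1), v in assign.items()
        if hid1 != origin[vid]
    }
    return sum(migrate.values())             # d6: SUM<C>


def within_migration_budget(assign, origin, max_migrates):
    """Constraint c3: the migration count must not exceed max_migrates."""
    return count_migrations(assign, origin) <= max_migrates
```

A solver would treat the budget check as a pruning condition during search rather than as a test applied to finished assignments.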

4.3 Distributed Colog 

Colog can be used for distributed optimizations, and we introduce
additional language constructs to express distributed computations.
Colog uses the location specifier @, a construct used in declarative
networking [19], to denote the source location of each corresponding
tuple. This allows us to write rules where the input data
spans multiple nodes, a convenient language construct for
formulating distributed optimizations.

To provide a concrete distributed example, we consider a distributed
implementation of the Follow-the-Sun cloud resource orchestration
model introduced in Section 3.1. At a high level, we
utilize an iterative distributed graph-based computation strategy, in
which all nodes execute a local COP, and then iteratively exchange
COP results with neighboring nodes until a stopping condition is
reached. In this execution model, data centers are represented as
nodes in a graph, and a link exists between two nodes if resources
can be migrated across them. The following Colog program implements
the local COP at each node X:

goal minimize C in aggCost(@X,C).
var migVm(@X,Y,D,R) forall toMigVm(@X,Y,D).

r1 toMigVm(@X,Y,D) <- setLink(@X,Y), dc(@X,D).



COP                 Colog
symbol $R_i$        resource(I,R)
symbol $C_i$        dc(I,C)
symbol $L_{ij}$     link(I,J)
symbol $A_{ij}$     curVm(I,J,R)
symbol $M_{ijk}$    migVm(I,J,K,R)
equation (1)        rule goal, d8
equation (2)        rule d4, d6
equation (3)        rule d3, d5
equation (4)        rule d7
equation (5)        rule d9-10, c1-2
equation (6)        rule r2

Table 1: Mappings from COP to Colog.



// next-step VM allocations after migration
d1 nextVm(@X,D,R) <- curVm(@X,D,R1),
       migVm(@X,Y,D,R2), R==R1-R2.
d2 nborNextVm(@X,Y,D,R) <- link(@Y,X), curVm(@Y,D,R1),
       migVm(@X,Y,D,R2), R==R1+R2.

// communication, operating and migration cost
d3 aggCommCost(@X,SUM<Cost>) <- nextVm(@X,D,R),
       commCost(@X,D,C), Cost==R*C.
d4 aggOpCost(@X,SUM<Cost>) <- nextVm(@X,D,R),
       opCost(@X,C), Cost==R*C.
d5 nborAggCommCost(@X,SUM<Cost>) <- link(@Y,X),
       commCost(@Y,D,C), nborNextVm(@X,Y,D,R), Cost==R*C.
d6 nborAggOpCost(@X,SUM<Cost>) <- link(@Y,X),
       opCost(@Y,C), nborNextVm(@X,Y,D,R), Cost==R*C.
d7 aggMigCost(@X,SUMABS<Cost>) <- migVm(@X,Y,D,R),
       migCost(@X,Y,C), Cost==R*C.

// total cost
d8 aggCost(@X,C) <- aggCommCost(@X,C1),
       aggOpCost(@X,C2), aggMigCost(@X,C3),
       nborAggCommCost(@X,C4), nborAggOpCost(@X,C5),
       C==C1+C2+C3+C4+C5.

// not exceeding resource capacity
d9 aggNextVm(@X,SUM<R>) <- nextVm(@X,D,R).
c1 aggNextVm(@X,R1) -> resource(@X,R2), R1<=R2.
d10 aggNborNextVm(@X,Y,SUM<R>) <- nborNextVm(@X,Y,D,R).
c2 aggNborNextVm(@X,Y,R1) -> link(@Y,X),
       resource(@Y,R2), R1<=R2.

// propagate to ensure symmetry and update allocations
r2 migVm(@Y,X,D,R2) <- setLink(@X,Y),
       migVm(@X,Y,D,R1), R2:=-R1.
r3 curVm(@X,D,R) <- curVm(@X,D,R1),
       migVm(@X,Y,D,R2), R:=R1-R2.
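
To make the local COP between a node X and one selected neighbor Y concrete, the Python sketch below enumerates candidate migration amounts and evaluates rules d1-d8 together with constraints c1-c2. It is an illustration only: the function name `local_cop` and its dictionary arguments are hypothetical, exhaustive search stands in for the constraint solver, and the range of R (chosen so that no allocation becomes negative) is an assumption that the program above does not spell out.

```python
from itertools import product


def local_cop(cur_x, cur_y, comm_x, comm_y, op_x, op_y, mig_cost, res_x, res_y):
    """Brute-force the local COP between node X and neighbor Y.

    cur_x, cur_y   : {demand: VMs currently allocated}   (curVm; same keys)
    comm_x, comm_y : {demand: communication cost per VM} (commCost)
    op_x, op_y     : operating cost per VM               (opCost)
    mig_cost       : migration cost per VM between X, Y  (migCost)
    res_x, res_y   : resource capacities                 (resource)
    Returns (cost, {demand: R}); R > 0 moves VMs from X to Y, R < 0 from Y to X.
    """
    demands = sorted(cur_x)
    # Assumed variable domain: never move more VMs than a side currently holds.
    ranges = [range(-cur_y[d], cur_x[d] + 1) for d in demands]
    best = None
    for moves in product(*ranges):
        next_x = {d: cur_x[d] - r for d, r in zip(demands, moves)}    # d1
        next_y = {d: cur_y[d] + r for d, r in zip(demands, moves)}    # d2
        # c1 and c2 (via d9, d10): stay within both resource capacities
        if sum(next_x.values()) > res_x or sum(next_y.values()) > res_y:
            continue
        cost = (sum(next_x[d] * comm_x[d] for d in demands)           # d3
                + sum(next_x.values()) * op_x                         # d4
                + sum(next_y[d] * comm_y[d] for d in demands)         # d5
                + sum(next_y.values()) * op_y                         # d6
                + sum(abs(r) for r in moves) * mig_cost)              # d7 (SUMABS)
        if best is None or cost < best[0]:                            # goal on d8
            best = (cost, dict(zip(demands, moves)))
    return best
```

In the distributed setting, the values passed for Y would be fetched from the neighbor over the network, which is what the location specifiers in rules d2, d5, d6 and c2 express.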

Program description. Table 1 summarizes the mapping from COP
symbols to Colog tables, and COP equations to Colog rules identified
by the rule labels. For instance, each entry in table $R_i$ is stored
as a resource(I,R) tuple. Likewise, the R attribute in
migVm(I,J,K,R) stores the value of $M_{ijk}$. The distributed COP
program works as follows.

• Optimization goal: Instead of minimizing the global total cost
of all data centers, the optimization goal of this local COP is the
total cost C in aggCost within a local region, i.e. node X and one
of its neighbors Y.

• COP execution trigger: Periodically, each node X randomly
selects one of its neighbors Y (denoted as a link(@X,Y) entry)
to initiate a VM migration process.¹ setLink(@X,Y) contains the
pair of nodes participating in the VM migration process. This
in essence results in the derivation of toMigVm in rule r1, which
directly triggers the execution of the local COP (implemented by
the rest of the rules). The output of the local COP determines the
quantity of resources migVm(@X,Y,D,R) that are to be migrated
between X and Y forall entries in toMigVm.

¹ To ensure that only one of two adjacent nodes initiates the VM migration
process, for any given link(X,Y), the protocol selects the
node with the larger identifier (or address) to carry out the subsequent
process. This distributed link negotiation can be specified in
13 Colog rules, which we omit due to space constraints.

• Solver derivations: During COP execution, rules d1 and d2
compute the next-step VM allocations after migration for nodes X
and Y, respectively. Rules d3-6 derive the aggregate communication
and operating costs for the two nodes. We note that rules d2 and
d5-6 are distributed solver derivation rules (i.e. not all rule
tables are at the same location), and node X collects its neighbor
Y's information (e.g. curVm, commCost and opCost) via implicit
distributed communications. Rule d7 derives the migration cost
via the aggregate keyword SUMABS, which sums the absolute values
of given variables. Rule d8 derives the optimization objective
aggCost by summing all communication, operating and migration
costs for both nodes X and Y.

• Solver constraints: Constraints c1 and c2 express that after
migration, nodes X and Y must not host more VMs than their
resource capacity, given by table resource, allows. Rule c2 is a
distributed constraint rule, where X retrieves neighbor Y's resource
table over the network to impose the constraint.

• Stopping condition: At the end of each COP execution, the
migration result migVm is propagated to the immediate neighbor Y
to ensure symmetry via rule r2. Then in rule r3 both nodes X and
Y update their curVm to reflect the changes incurred by VM
migration. The above process is then repeated until all links
have been assigned values, i.e. migration decisions between any
two neighboring data centers have been made. In essence, one
can view the distributed program as a series of per-node COPs
carried out using each node's constraint solver. The complexity
of this program depends upon the maximum node degree, since
each node needs to perform at most m rounds of link negotiations,
where m is the node degree.
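
To make the per-link COP above concrete, the following is a minimal Python sketch that enumerates all migration choices for one pair of nodes X and Y and keeps the cheapest feasible one. It is not how Cologne executes the program (Cologne compiles the rules into RapidNet and Gecode code); the function name `local_cop`, the dictionary-based inputs, and the variable domain (a node cannot send more VMs than it currently hosts) are assumptions made for illustration. The sign convention follows rules d2 and r3: a positive migVm value moves VMs from X to Y.

```python
from itertools import product

def local_cop(cur_x, cur_y, comm_x, comm_y, op_x, op_y, mig_cost, cap_x, cap_y):
    """Brute-force the per-link COP between node X and its neighbor Y.

    cur_x, cur_y   : {demand location D: VMs currently hosted at X / Y}
    comm_x, comm_y : {D: per-VM communication cost at X / Y}
    op_x, op_y     : per-VM operating cost at X / Y
    mig_cost       : per-VM migration cost on the link (X, Y)
    cap_x, cap_y   : resource capacity of X / Y
    Returns (minimum aggCost, {D: migVm value}) or None if nothing is feasible.
    """
    demands = sorted(cur_x)
    # Assumed domain: allocations at X and Y never become negative.
    domains = [range(-cur_y[d], cur_x[d] + 1) for d in demands]
    best = None
    for choice in product(*domains):
        mig = dict(zip(demands, choice))
        next_x = {d: cur_x[d] - mig[d] for d in demands}  # allocation at X after migration
        next_y = {d: cur_y[d] + mig[d] for d in demands}  # rule d2: R == R1 + R2
        # Constraints c1 and c2: stay within resource capacity.
        if sum(next_x.values()) > cap_x or sum(next_y.values()) > cap_y:
            continue
        cost = (sum(next_x[d] * comm_x[d] for d in demands)     # d3
                + sum(next_x[d] * op_x for d in demands)        # d4
                + sum(next_y[d] * comm_y[d] for d in demands)   # d5
                + sum(next_y[d] * op_y for d in demands)        # d6
                + sum(abs(mig[d]) * mig_cost for d in demands)) # d7 (SUMABS)
        if best is None or cost < best[0]:                      # d8 is the goal
            best = (cost, mig)
    return best

result = local_cop(cur_x={"a": 2, "b": 1}, cur_y={"a": 0, "b": 2},
                   comm_x={"a": 5, "b": 1}, comm_y={"a": 1, "b": 5},
                   op_x=10, op_y=10, mig_cost=1, cap_x=10, cap_y=10)
print(result)
```

In this toy input, demand "a" is cheaper to serve from Y and demand "b" from X, so the sketch moves two VMs in each direction. Exhaustive enumeration is only viable for tiny domains; a real solver prunes the search space instead.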

Our use of the declarative Colog language makes policy
customization easy. For example, we can impose restrictions on the
maximum quantity of resources to be migrated due to factors like
high CPU load or router traffic in data centers, or impose
constraints that the total cost after optimization should be smaller
by a threshold than before optimization. These two policies can be
defined as the rules below.

d11 aggMigVm(@X,Y,SUMABS<R>) <- migVm(@X,Y,D,R).
c3 aggMigVm(@X,Y,R) -> R<=max_migrates.

c4 aggCost(@X,C) -> originCost(@X,C2), C<=cost_thres*C2.

Rule d11 derives the total VM migrations between X and Y.
Constraint c3 ensures that total migrations do not exceed a
pre-defined threshold max_migrates. Rule c4 guarantees that aggCost
after migration is below the product of the original cost originCost
and a threshold cost_thres. originCost can be derived by 5 additional
Colog rules, which are omitted here.
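
As a small worked illustration of what c3 and c4 require of a candidate solution, the check below evaluates both policies on concrete numbers. The function name and arguments are hypothetical; in Cologne these rules become solver constraints rather than a post-hoc test.

```python
def satisfies_policies(mig, agg_cost, origin_cost, max_migrates, cost_thres):
    """Check a candidate migration against policies c3 and c4.

    mig maps a demand location D to the migVm value R on one link (X, Y).
    """
    agg_mig = sum(abs(r) for r in mig.values())       # rule d11 (SUMABS<R>)
    within_limit = agg_mig <= max_migrates            # constraint c3
    cheap_enough = agg_cost <= cost_thres * origin_cost  # constraint c4
    return within_limit and cheap_enough

# Four VMs moved in total; cost drops from 71 to 59, which is below 0.9 * 71.
print(satisfies_policies({"a": 2, "b": -2}, 59, 71, max_migrates=4, cost_thres=0.9))
```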

In distributed COP execution, each node exposes only limited
information to its neighbors. This information includes curVm,
commCost, opCost and resource, as demonstrated in rules d2, d5-6
and c2. This leads to better autonomy for each Cologne instance,
since no centralized entity collects the information of all nodes.
Distributing its computation gives Colog a second advantage: by
decomposing a big problem (e.g. VM migrations between all data
centers) into multiple sub-problems (e.g. VM migrations on a single
link) and solving each sub-problem in a distributed fashion, it is
able to achieve better scalability as the problem size grows by
providing approximate solutions.



5. EXECUTION PLAN GENERATION 

This section describes the process of generating execution plans
from Colog programs. Cologne's compiler and runtime system are
implemented by integrating a distributed query processor (used in
declarative networking) with an off-the-shelf constraint solver.

In our implementation, we use the RapidNet [5] declarative
networking engine together with the Gecode [2] high performance
constraint solver. However, the techniques described in this section
are generic and can be applied to other distributed query engines and
solvers as well.

5.1 General Rule Evaluation Strategy 

Cologne uses a declarative networking engine for executing
distributed Datalog rules and, as we shall see later in the section,
for implementing solver derivation rules and enforcing solver
constraint rules. A declarative networking engine executes
distributed Datalog programs using an asynchronous evaluation
strategy known as pipelined semi-naive (PSN) [18] evaluation. The
high-level intuition here is that instead of evaluating Datalog programs
in fixed rounds of iterations, one can pipeline and evaluate rules 
incrementally as tuples arrive at each node, until a global fixpoint 
is reached. To implement this evaluation strategy, Cologne adopts 
declarative networking's execution model. Each node runs a set 
of local delta rules, which are implemented as a dataflow consist- 
ing of database operators for implementing the Datalog rules, and 
additional network operators for handling incoming and outgoing 
messages. All rules are executed in a continuous, long-running 
fashion, where rule head tuples are continuously updated (inserted 
or deleted) via a technique known as incremental view mainte- 
nance [20] as the body predicates are updated. This avoids having 
to recompute a rule from scratch whenever the inputs to the rule 
change. 
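
The pipelined, tuple-at-a-time flavor of this strategy can be sketched in a few lines of Python for a single node. This is only an illustration, assuming a toy reachability program (path(X,Y) <- link(X,Y) and path(X,Z) <- link(X,Y), path(Y,Z)) that is not part of this paper's use cases; it handles insertions only and omits network operators and deletions, which the real engine supports.

```python
from collections import deque

def psn_reachability(links):
    """Derive path tuples incrementally, processing one arriving tuple at a time."""
    link, path = set(), set()
    queue = deque(("link", t) for t in links)    # tuples "arriving" at the node
    while queue:                                  # an empty queue is a fixpoint
        table, (a, b) = queue.popleft()
        if table == "link":
            if (a, b) in link:
                continue                          # duplicate: nothing new to derive
            link.add((a, b))
            queue.append(("path", (a, b)))        # path(X,Y) <- link(X,Y)
            # Delta rule: new link(X,Y) joined with stored path(Y,Z).
            queue.extend(("path", (a, z)) for (y, z) in path if y == b)
        else:
            if (a, b) in path:
                continue
            path.add((a, b))
            # Delta rule: stored link(X,Y) joined with new path(Y,Z).
            queue.extend(("path", (x, b)) for (x, y) in link if y == a)
    return path

print(sorted(psn_reachability([("a", "b"), ("b", "c"), ("c", "d")])))
```

Each new tuple is joined only against tuples already stored, so earlier derivations are never recomputed, and the result does not depend on the order in which tuples arrive.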

A key component of Cologne is the integration of a distributed
query processor and a constraint solver running at each node. At
a high level, Colog solver rules are compiled into executable code
in RapidNet and Gecode. Our compilation process maps Colog's
goal, var, solver derivations and constraints into equivalent COP
primitives in Gecode. Whenever a solver derivation rule is executed
(triggered by an update in the rule body predicates), RapidNet
invokes Gecode's high-performance constraint solving modules, which
adopt the standard branch-and-bound search approach to solve the
optimization while exploring the space of variables under
constraints.

Gecode's solving modules are invoked by first loading in the
appropriate input regular tables from RapidNet. After Gecode
executes its optimization modules, the optimization output (i.e. the
optimization goal goal and variables var) is materialized as RapidNet
tables, which may trigger reevaluation of other rules via incremental
view maintenance.
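
For readers unfamiliar with branch-and-bound, the following is a generic Python sketch of the idea, not Gecode's API or implementation. It assumes an objective that is a sum of non-negative per-variable costs, so the cost of a partial assignment is a valid lower bound; the function and parameter names are illustrative.

```python
def branch_and_bound(domains, cost, feasible):
    """Minimize sum(cost(i, v)) over complete assignments.

    domains  : list of iterables, one per variable
    cost     : cost(i, v) -> non-negative cost of giving variable i the value v
    feasible : feasible(partial) -> False once a constraint is violated
    Returns (best cost, best assignment), or (inf, None) if infeasible.
    """
    best = {"cost": float("inf"), "assign": None}

    def search(partial, so_far):
        if so_far >= best["cost"]:        # bound: cannot beat the incumbent
            return
        if not feasible(partial):         # constraint pruning
            return
        if len(partial) == len(domains):  # complete assignment: new incumbent
            best["cost"], best["assign"] = so_far, list(partial)
            return
        i = len(partial)
        for v in domains[i]:              # branch on the next variable
            partial.append(v)
            search(partial, so_far + cost(i, v))
            partial.pop()

    search([], 0)
    return best["cost"], best["assign"]

# Toy placement: three VMs (CPU 30, 20, 10) on two hosts with capacity 40;
# placing a VM on host 0 costs 1 and on host 1 costs 3.
vm_cpu, price, capacity = [30, 20, 10], [1, 3], 40

def fits(partial):
    load = [0, 0]
    for vm, host in enumerate(partial):
        load[host] += vm_cpu[vm]
    return max(load) <= capacity

print(branch_and_bound([range(2)] * 3, lambda i, h: price[h], fits))
```

The incumbent cost tightens as better solutions are found, which is what lets a time-limited search still return the best solution seen so far.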

5.2 Solver Rules Identification 

In order to process solver rules, Cologne combines the use of 
the basic PSN evaluation strategy with calls to the constraint solver 
at each node. Since these rules are treated differently from regular 
Datalog rules, the compiler needs to identify solver rules via a static 
analysis phase at compile time. 

The analysis works by first identifying the initial solver variables
defined in var. Solver attributes are then identified by analyzing
each Colog rule, to find attributes that depend on the initial
solver variables (either directly or transitively). Once an attribute
is identified as a solver attribute, the predicates that refer to it
are identified as solver tables. Rules that involve these solver
tables are hence identified as solver rules. Solver derivation and
constraint rules are differentiated trivially via rule syntax (<- vs ->).

Example. To demonstrate this process, we consider the ACloud
example in Section 4.2. assign, hostCpu, hostStdevCpu, assignCount
and hostMem are identified as solver tables as follows:

• Attribute V in var is a solver attribute of table assign, since V
does not appear after forall.

• In rule d1, given the boolean expression C==V*Cpu, C is identified
as a solver attribute of table hostCpu. Hence, transitively, C is a
solver attribute of hostStdevCpu in rule d2.

• In rule d3, V is a known solver attribute of assign and it appears
in the rule head, so V is a solver attribute of table assignCount.

• Finally, in rule d4, since M depends on V due to the assignment
M==V*Mem, one can infer that M is a solver attribute of hostMem.

Once the solver tables are identified, rules d1-d4 are trivially
identified as solver derivation rules. Rules c1 and c2 are legal solver
constraint rules since their rule heads assignCount and hostMem are
solver tables.
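
The static analysis just described amounts to a small fixpoint computation, sketched below in Python. The rule encoding is hypothetical and simplified: each rule is a (label, head, body, expressions) tuple, the ACloud rules are paraphrased from the bullets above rather than copied from Section 4.2, and rule d2 in particular is reduced to its dependence on hostCpu.

```python
def find_solver_tables(var_decls, rules):
    """Return ({table: solver attribute positions}, [labels of solver rules]).

    var_decls : {table: set of positions declared as solver variables in var}
    rules     : list of (label, (head_table, head_vars), body, exprs), where
                body is a list of (table, vars) and exprs is a list of
                (defined_var, used_vars) pairs such as ("C", ["V", "Cpu"]).
    """
    solver = {table: set(pos) for table, pos in var_decls.items()}
    changed = True
    while changed:                        # repeat until nothing new is found
        changed = False
        for _label, (head_table, head_vars), body, exprs in rules:
            # Variables bound to known solver attributes in the rule body.
            svars = {v for table, args in body
                     for i, v in enumerate(args) if i in solver.get(table, ())}
            # Propagate dependencies through expressions (chains included).
            for _ in exprs:
                for defined, used in exprs:
                    if svars & set(used):
                        svars.add(defined)
            # Head positions carrying a solver variable become solver attributes.
            for i, v in enumerate(head_vars):
                if v in svars and i not in solver.get(head_table, ()):
                    solver.setdefault(head_table, set()).add(i)
                    changed = True
    solver_rules = [label for label, (head_table, _), body, _ in rules
                    if head_table in solver or any(t in solver for t, _ in body)]
    return solver, solver_rules

# Hypothetical, simplified encoding of the ACloud derivation rules.
ACLOUD_VARS = {"assign": {2}}             # var assign(Vid,Hid,V): V is position 2
VM, ASSIGN = ("vm", ["Vid", "Cpu", "Mem"]), ("assign", ["Vid", "Hid", "V"])
ACLOUD_RULES = [
    ("r1", ("toAssign", ["Vid", "Hid"]), [VM, ("host", ["Hid", "Cpu2", "Mem2"])], []),
    ("d1", ("hostCpu", ["Hid", "C"]), [ASSIGN, VM], [("C", ["V", "Cpu"])]),
    ("d2", ("hostStdevCpu", ["C"]), [("hostCpu", ["Hid", "C"])], []),
    ("d3", ("assignCount", ["Vid", "V"]), [ASSIGN], []),
    ("d4", ("hostMem", ["Hid", "M"]), [ASSIGN, VM], [("M", ["V", "Mem"])]),
]

tables, solver_rules = find_solver_tables(ACLOUD_VARS, ACLOUD_RULES)
print(sorted(tables), solver_rules)
```

Under this encoding the regular rule r1 is left out, while d1-d4 are flagged as solver rules, matching the example.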

In the rest of this section, we present the steps required for pro- 
cessing solver derivation and constraint rules. For ease of expo- 
sition, we first do not consider distributed evaluation, which we 
revisit in Section 5.5. 

5.3 Solver Derivation Rules 

To ensure maximum code reuse, solver derivation rules leverage 
the same query processing operators already in place for evaluat- 
ing Datalog rules. As a result, we focus only on the differences 
in evaluating these rules compared to regular Datalog rules. The 
main difference lies in the treatment of solver attributes in selection 
and aggregation expressions. Since solver attribute values are un- 
defined until the solver's optimization modules are executed, they 
cannot be directly evaluated simply based on existing RapidNet ta- 
bles. Instead, constraints are generated from selection and aggrega- 
tion expressions in these rules, and then instantiated within Gecode 
as general constraints for reducing the search space. Cologne
currently does not allow joins to occur on solver attributes, since
in our experience there are no such use cases in practice.
Furthermore, joins on solver attributes are prohibitively expensive
to implement and complicate our design unnecessarily, since they
require enumerating all possible values of solver variables.

Example. We revisit rule d1 in the ACloud example in Section 4.2.
The selection expression C==V*Cpu involves an existing solver
attribute V. Hence, a new solver variable C is created within Gecode,
and a binding between C and V is expressed as a Gecode constraint,
which expresses the invariant that C has to be equal to V*Cpu.

Likewise, in rule d4, the aggregate SUM is computed over a solver
attribute M. This requires the generation of a Gecode constraint that
binds a new sum variable to the total of all M values.

5.4 Solver Constraint Rules 

Unlike solver derivation rules, solver constraint rules simply im- 
pose constraints on existing solver variables, but do not derive new 
ones. However, the compilation process shares similarities in the
treatment of selection and aggregation expressions that involve 
solver attributes. The main difference lies in the fact that each 
solver constraint rule itself results in the generation of a Gecode 
constraint. 

Example. We use rule c2 in Section 4.2 as an example.
Since the selection expression Mem<=M involves solver attribute M,
we impose a Gecode solver constraint expressing that host memory
M should be less than or equal to the memory capacity Mem. This has
the effect of pruning the search space when the rule is evaluated.



5.5 Distributed Solving 

Finally, we describe plan generation involving Colog rules with 
location specifiers to capture distributed computations. We focus 
on solver derivation and constraint rules that involve distribution, 
and describe these modifications with respect to Sections 5.3 and 
5.4. 

At a high level, Cologne uses RapidNet for executing distributed 
rules whose predicates span across multiple nodes. The basic mech- 
anism is not unlike PSN evaluation for distributed Datalog pro- 
grams [18]. Each distributed solver derivation or constraint rule 
(with multiple distinct location specifiers) is rewritten using a local- 
ization rewrite [19] step. This transformation results in rule bodies 
that can be executed locally, and rule heads that can be derived and 
sent across nodes. The beauty of this rewrite is that even if the 
original program expresses distributed derivations and constraints, 
this rewrite process will realize multiple centralized local COP op- 
erations at different nodes, and have the output of COP operations 
via derivations sent across nodes. This allows us to implement a 
distributed solver that can perform incremental and distributed con- 
straint optimization. 

Example. We illustrate distributed solving using the Follow-the-Sun
orchestration program in Section 4.3. Rule d2 is a solver derivation
rule that spans across two nodes X and Y. During COP execution,
d2 retrieves rule body tables link and curVm from node Y to
perform solver derivation. In Cologne, d2 is internally rewritten as
the following two rules via the localization rewrite:

d21 tmp(@X,Y,D,R1) <- link(@Y,X), curVm(@Y,D,R1).
d22 nborNextVm(@X,Y,D,R) <- tmp(@X,Y,D,R1),
       migVm(@X,Y,D,R2), R==R1+R2.

Rule d21 is a regular distributed Datalog rule, whose rule body
consists of the tables with location Y in d2. Its rule head is an
intermediate regular table tmp, which combines all the attributes
from its rule body. In essence, rule d21 results in table tmp being
generated at node Y and sent over the network to X. This rewrite is
handled transparently by RapidNet's distributed query engine. Rule
d22 is a centralized solver derivation rule, which can be executed
using the mechanism described in Section 5.3.
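
The data flow produced by this rewrite can be mimicked with two plain Python functions, one per rewritten rule. This is a simplified sketch: message delivery is modeled as a list of (destination, tuple) pairs, and migVm holds concrete numbers here, whereas in Cologne R2 is a solver variable and d22 yields a constraint R==R1+R2 inside the solver. All function names are illustrative.

```python
def rule_d21(node_y, link_at_y, cur_vm_at_y):
    """Runs at Y: join link(@Y,X) with curVm(@Y,D,R1) and address tmp tuples to X."""
    return [(x, ("tmp", x, node_y, d, r1))              # (destination, tuple)
            for (y, x) in link_at_y if y == node_y
            for (y2, d, r1) in cur_vm_at_y if y2 == node_y]

def rule_d22(tmp_at_x, mig_vm_at_x):
    """Runs at X: join tmp(@X,Y,D,R1) with migVm(@X,Y,D,R2), giving R = R1 + R2."""
    return {(x, y, d, r1 + r2)
            for (_name, x, y, d, r1) in tmp_at_x
            for (x2, y2, d2, r2) in mig_vm_at_x
            if (x, y, d) == (x2, y2, d2)}

# Node "y" hosts 5 VMs for demand d1 and 3 for d2, and has a link to "x".
messages = rule_d21("y", [("y", "x")], [("y", "d1", 5), ("y", "d2", 3)])
inbox_x = [t for dest, t in messages if dest == "x"]    # delivered over the network
nbor_next_vm = rule_d22(inbox_x, [("x", "y", "d1", 2), ("x", "y", "d2", -1)])
print(sorted(nbor_next_vm))
```

Only the tmp tuples cross the network; the join with migVm and the arithmetic both happen locally at X.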

6. EVALUATION 

This section provides a performance evaluation of Cologne. Our 
prototype system is developed using the RapidNet declarative net- 
working engine [5] and the Gecode [2] constraint solver. Cologne 
takes as input policy goals and constraints written in Colog, and 
then generates RapidNet and Gecode code in C++, using the
compilation process described in Section 5.

Our experiments are carried out using a combination of realistic
network simulations and actual distributed deployments, using
production traces. In our simulation-based experiments, we use
RapidNet's built-in support for the ns-3 simulator [3], an emerging
discrete event-driven simulator which emulates all layers of the
network stack. This allows us to run Cologne instances in a simulated
network environment and evaluate Cologne's distributed capabilities.
In addition, we can also run our experiments under an implementation
mode, which enables users to run the same Cologne instances,
but uses actual sockets (instead of ns-3) to allow Cologne instances
deployed on real physical nodes to communicate with each other.

Our evaluation aims to demonstrate the following. First, Cologne 
is a general platform that is capable of enabling a wide range of dis- 
tributed systems optimizations. Second, most of the policies spec- 
ified in Cologne result in orders of magnitude reduction in code 
size compared to imperative implementations. Third, Cologne in- 
curs low communication overhead and small memory footprint, re- 
quires low compilation time, and converges quickly at runtime for 
distributed executions. 

Our evaluation section is organized around various use cases
that we have presented in Section 3. These include: (1) ACloud
load balancing orchestration (Section 4.2); (2) Follow-the-Sun
orchestration (Section 4.3), and (3) wireless channel selection
(Section 3.2). Our cloud orchestration use cases derive their input
data from actual data center traces obtained from a large hosting
company. In our evaluations, we use a combination of running Cologne
over the ns-3 simulator, and deployment on an actual wireless
testbed [4].

6.1 Compactness of Colog Programs 

We first provide evidence to demonstrate the compactness of our 
Colog implementations, by comparing the number of rules in Colog 
and the generated C++ code. 



Protocol                        Colog (rules)   Imperative (C++ LOC)
ACloud (centralized)            10              935
Follow-the-Sun (centralized)    16              1487
Follow-the-Sun (distributed)    32              3112
Wireless (Centralized)          35              3229
Wireless (Distributed)          48              4445

Table 2: Colog and Compiled C++ comparison.



Table 2 illustrates the compactness of Colog, by comparing the
number of Colog rules (2nd column) for the five representative
programs we have implemented against the actual number of lines of
code (LOC) in the generated RapidNet and Gecode C++ code (3rd
column), counted using sloccount. Each Colog program includes all
rules required to implement Gecode solving and RapidNet distributed
communications. The generated imperative code is approximately
100X the size of the equivalent Colog program. The generated
code is a good estimate of the LOC required by a programmer
to implement these protocols in a traditional imperative language.
In fact, Colog's reduction in code size should be viewed as a lower
bound. This is because the generated C++ code implements only
the rule processing logic, and does not include Cologne's various
built-in libraries, e.g. Gecode's constraint solving modules and the
network layers provided by RapidNet. These built-in libraries need
to be written only once, and are reused across all protocols written
in Colog.
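
The "approximately 100X" figure can be checked directly from Table 2. The short Python snippet below divides each generated LOC count by the corresponding number of Colog rules; only the numbers come from the table, the variable and function names are illustrative.

```python
# (Colog rules, generated C++ LOC) per program, copied from Table 2.
TABLE_2 = {
    "ACloud (centralized)": (10, 935),
    "Follow-the-Sun (centralized)": (16, 1487),
    "Follow-the-Sun (distributed)": (32, 3112),
    "Wireless (Centralized)": (35, 3229),
    "Wireless (Distributed)": (48, 4445),
}

def loc_per_rule(table):
    """Return {program: generated C++ lines per Colog rule}."""
    return {name: loc / rules for name, (rules, loc) in table.items()}

ratios = loc_per_rule(TABLE_2)
for name, ratio in ratios.items():
    print(f"{name:30s} {ratio:6.1f}")
```

Every program falls between roughly 92 and 97 generated lines per Colog rule.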

While a detailed user study would allow us to comprehensively
validate the usability of Colog, we note that the orders of magnitude
reduction in code size makes it significantly easier to model
complex problems quickly in Colog, and makes Colog programs easier
to understand, debug and extend than multi-thousand-line imperative
alternatives.

6.2 Use Case #1: ACloud 

In our first set of experiments, we perform a trace-driven evalu- 
ation of the ACloud scenario. Here, we assume a single cloud con- 
troller deployed with Cologne, running the centralized ACloud pro- 
gram written in Colog (Section 4.2). Benchmarking the centralized 
program first allows us to isolate the overhead of the solver, without 
adding communication overhead incurred by distributed solving. 

Experimental workload. As input to the experiment, we use
data center traces obtained from a large hosting company in the
US. The data contains up to 248 customers hosted on a total of
1,740 statically allocated physical processors (PPs). Each customer
application is deployed on a subset of the PPs. The entire trace is
one month in duration, and primarily consists of samples of CPU and
memory utilization at each PP, gathered at 300-second intervals.

Based on the trace, we generate a workload in a hypothetical
cloud environment similar to ACloud where there are 15 physical
machines geographically dispersed across 3 data centers (5 hosts
each). Each physical machine has 32GB memory. We preallocate
80 migratable VMs on each of 12 hosts, and the other 3 hosts serve
as storage servers for each of the three data centers. This allows
us to simulate a deployment scenario involving about 1000 VMs.
We next use the trace to derive the workload as a series of VM
operations:

• VM spawn: CPU demand (% PP used) is aggregated over all 
PPs belonging to a customer at every time interval. We compute 
the average CPU load, assuming that load is equally distributed 
among the allocated VMs. Whenever a customer's average CPU 
load per VM exceeds a predefined high threshold (80% in our 
experiment) and there are no free VMs available, one additional 
VM is spawned on a random host by cloning from an image tem- 
plate. 

• VM stop and start: Whenever a customer's average CPU load 
drops below a predefined low threshold (20% in our experiment), 
one of its VMs is powered off to save resources (e.g. energy and 
memory). We assume that powered-off VMs are not reclaimed 
by the cloud. Customers may bring their VMs back by powering 
them on when the CPU demands become high later. 
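
The two workload rules can be summarized as a small decision function. The sketch below is one reading of the description, not the actual workload generator: it assumes that "free VMs" are the customer's powered-off VMs, which are restarted before any new VM is spawned, and all names are hypothetical.

```python
def average_load(pp_demands, n_vms):
    """Average CPU load per VM, assuming load is spread equally over the VMs."""
    return sum(pp_demands) / n_vms

def vm_operation(avg_cpu_per_vm, free_vms, high=80.0, low=20.0):
    """Return the VM operation triggered for one customer in one time interval."""
    if avg_cpu_per_vm > high:
        # Assumed: powered-off VMs are brought back before cloning a new one.
        return "start" if free_vms > 0 else "spawn"
    if avg_cpu_per_vm < low:
        return "stop"                     # power one VM off to save resources
    return None                           # load is within the thresholds

# A customer whose PPs report 90% + 80% demand, served by two VMs.
load = average_load([90.0, 80.0], n_vms=2)
print(load, vm_operation(load, free_vms=0))
```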

Using the above workload, the ACloud program takes as input
vm(Vid,Cpu,Mem) and host(Hid,Cpu,Mem) tables, which are
continuously being updated by the workload generator as the trace is
replayed.

Policy validation. We compare two ACloud policies against two 
strawman policies (default and heuristic): 

• ACloud. This essentially corresponds to the Colog program pre- 
sented in Section 4.2. We configure the ACloud program to pe- 
riodically execute every 10 minutes to perform a COP compu- 
tation for orchestrating load balancing via VM migration within 
each data center. To avoid migrating VMs with very low CPU
utilization, the vm table only includes VMs whose CPU utilization is
larger than 20%.

• ACloud (M). To demonstrate the flexibility of Colog, we provide 
a slight variant of the above policy, that limits the number of VM 
migrations within each data center to be no larger than 3 for each 
interval. This requires only minor modifications to the Colog 
program, by adding rules d5-6 and c3 as shown in Section 4.2. 

• Default. A naive strategy, which simply does no migration after 
VMs are initially placed on random hosts. 

• Heuristic. A threshold-based policy that migrates VMs from 
the most loaded host (i.e. with the highest aggregate CPU of the 
VMs running on it) to the least one, until the most-to-least load 
ratio is below a threshold K (1.05 in our experiment). Heuristic 
emulates an ad-hoc strategy that a cloud operator may adopt in 
the absence of Cologne. 
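
For comparison with the COP-based policies, the Heuristic strawman can be sketched as a simple loop. The description above does not say which VM is moved at each step, so the sketch assumes the largest VM that still narrows the gap between the two hosts; that choice, the function name and the data layout are all assumptions.

```python
def heuristic_balance(hosts, k=1.05, max_moves=1000):
    """Move VMs from the most to the least loaded host until the ratio is below k.

    hosts maps a host id to the list of CPU demands of its VMs (mutated in place).
    Returns the list of migrations as (vm_cpu, from_host, to_host).
    """
    moves = []
    for _ in range(max_moves):
        load = {h: sum(vms) for h, vms in hosts.items()}
        hi = max(load, key=load.get)
        lo = min(load, key=load.get)
        if load[lo] > 0 and load[hi] / load[lo] < k:
            break                          # balanced enough
        gap = load[hi] - load[lo]
        # Assumed choice: the largest VM whose move still narrows the gap.
        candidates = [cpu for cpu in hosts[hi] if 0 < cpu < gap]
        if not candidates:
            break                          # no single move improves the balance
        vm = max(candidates)
        hosts[hi].remove(vm)
        hosts[lo].append(vm)
        moves.append((vm, hi, lo))
    return moves

hosts = {"h1": [40, 30, 20], "h2": [10], "h3": [50]}
moves = heuristic_balance(hosts)
print(moves, {h: sum(v) for h, v in hosts.items()})
```

Requiring each move to narrow the gap guarantees termination, since the sum of squared host loads strictly decreases with every migration.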

Figure 2 shows the average CPU standard deviation of the three data
centers achieved by the ACloud program over a 4-hour period. We
observe that ACloud is able to perform load balancing more
effectively, achieving a 98.1% and 87.8% reduction in the degree of
CPU load imbalance as compared to Default and Heuristic,
respectively. ACloud (M) also performs favorably compared to Default
and Heuristic, with only a marginal increase in standard deviation
relative to ACloud.

Figure 3 shows that on average, ACloud performs 20.3 VM
migrations every interval. In contrast, ACloud (M) (with the
migration constraint) substantially reduces the number of VM
migrations to 9 VMs per interval (3 per data center).

Compilation and runtime overhead. The Colog program is
compiled and executed on an Intel Quad core 2.33GHz PC with 
4GB RAM running Ubuntu 10.04. Compilation takes on average 
0.5 seconds (averaged across 10 runs). For larger-scale data centers 
with more migratable VMs, the solver will require exponentially 
more time to terminate. This makes it hard to reach the optimal 
solution in reasonable time. As a result, we limit each solver's 
COP execution time to 10 seconds. Nevertheless, we note from our 
results that the solver output still yields close-to-optimal solutions. 
The memory footprint is 9MB (on average), and 20MB (maximum) 
for the solver, and 12MB (relatively stable) for the base RapidNet 
program. 

6.3 Use Case #2: Follow-the-Sun

Our second evaluation is based on the Follow-the-Sun scenario.
We use the distributed Colog program (Section 4.3) for implement- 
ing the Follow-the-Sun policies. The focus of our evaluation is 
to validate the effectiveness of the Follow-the-Sun program at re- 
ducing total cost for cloud providers, and to examine the scala- 
bility, convergence time and overhead of distributed solving using 
Cologne. Our evaluation is carried out by running Cologne in
simulation mode, with communication directed across all Cologne
instances through the ns-3 simulator. We configure the underlying
network to use ns-3's built-in 10Mbps Ethernet, and all
communication is done via UDP messaging.

Experimental workload. Our experiment setup consists of mul- 
tiple data centers geographically distributed at different locations. 
We conducted 5 experimental runs, where we vary the number of 
data centers from 2 to 10. For each network size, we execute the 
distributed Colog program once to determine the VM migrations 
that minimize the cloud providers' total cost. Recall from Sec- 
tion 4.3 that this program executes in a distributed fashion, where 
each node runs a local COP, exchanges optimization outputs and 
reoptimizes, until a fixpoint is reached. 

The data centers are connected via random links with an average
network degree of 3. In the absence of actual traces, our
experimental workload (in particular, the operating, communication
and migration costs) is synthetically generated. However,
the results still provide insight into the communication/computation
overhead and effectiveness of the Follow-the-Sun program.

Each data center has a resource capacity of 60 units of migrat- 
able VMs (the unit here is by no means actual, e.g. one unit can 
denote 100 physical VMs). Data centers have a random placement 
of current VMs for demands at different locations, ranging from 
to 10. Given that data centers may span across geographic regions, 
communication and migration costs between data centers may
differ. As a result, between any two neighboring data centers, we
generate the communication cost randomly from 50 to 100, and the
migration cost from 10 to 20. The operating cost is fixed at 10 for
all data centers.

Policy validation. Figure 4 shows the total costs (migration, op- 
erating, and communication) over time, while the Follow-the-Sun 
program executes to a fixpoint in a distributed fashion. The total
cost corresponds to the aggCost (optimization goal) in the program
in Section 4.3. To make it comparable across experimental runs
with different network sizes, we normalize the total cost so that its
initial value is 100% when the COP execution starts. We observe
that in all experiments, Follow-the-Sun achieves a cost reduction
after each round of distributed COP execution. Overall the cost
reduction ranges from 40.4% to 11.2%, as the number of data
centers increases from 2 to 10. As the network size gets larger, the
cost reduction is less apparent. This is because distributed solving 
approximates the optimal solution. As the search space of COP 
execution grows exponentially with the problem size, it becomes 
harder for the solver to reach the optimal solution. 

To demonstrate the flexibility of Colog in enabling different 
Follow-the-Sun policies, we modify the original Follow-the-Sun 
program slightly to limit the number of migrations between any two 
data centers to be less than or equal to 20, achieved with rules d11
and c3 as introduced in Section 4.3. This modified policy achieves 
comparable cost reduction ratios and convergence times as before, 
while reducing the number of VM migrations by 24% on average. 

Compilation and runtime overhead. The compilation time of
the program is 0.6 seconds on average for 10 runs. Figure 4
indicates that as the network size scales up, the program takes a
longer time to converge to a fixpoint. This is due to more rounds of
link negotiations. The periodic timer between individual link
negotiations is set to 5 seconds in our experiment. Since the solver
computation only requires input information within a node's
neighborhood, each per-link COP computation during negotiation is
highly efficient and completes within 0.5 seconds on average. The
memory footprint is tiny, with 172KB (average) and 410KB (maximum)
for the solver, and 12MB for the RapidNet base program.

In terms of bandwidth utilization, we measure the communi- 
cation overhead during distributed COP execution. The per-node 
communication overhead is shown in Figure 5. We note that Cologne 
is highly bandwidth-efficient, with a linear growth as the number of 
data centers scales up. For 10 data centers, the per-node communi- 
cation overhead is about 3.5KBps. 

6.4 Use Case #3: Wireless Channel Selection 

In our final set of experiments, we perform evaluations of using 
Cologne to support declarative wireless channel selection policies 
(Section 3.2 and Appendix A). 

Experimental setup. Our experimental setup consists of
deploying Cologne instances on ORBIT [4], a popular wireless testbed
that consists of machines arranged in a grid communicating with
each other using 802.11. Each ORBIT node is equipped with 1
GHz VIA Nehemiah processors, 64KB cache and 512MB RAM.
We selected 30 ORBIT nodes in an 8m x 5m grid to execute one
Cologne instance each. Each of these 30 nodes utilizes two Atheros
AR5212-based 802.11 a/b/g cards as its data interfaces.

Policy validation. In our evaluation, we execute three channel
selection protocols: Centralized (Appendix A.2), Distributed
(Appendix A.3) and Cross-layer, a distributed cross-layer protocol [14]
written in Colog that integrates and optimizes across channel
selection and routing policy decisions. We compare against two
baseline protocols, Identical-Ch and 1-Interface. In 1-Interface, all
nodes communicate with each other using one interface and hence a
common channel. In Identical-Ch [12], the same set of channels is
assigned to the interfaces of every node, and a centralized constraint
solver then assigns each link to use one of these interfaces.

We injected packets into Cologne instances with increasing
sending rates, and then measured the aggregate network throughput,
defined in terms of network-wide aggregate data packet transmissions
that are successfully received by destination nodes. The result is
depicted in Figure 6. We make the following observations. First,
the centralized and distributed protocols implemented in Cologne
significantly outperform the single-interface and identical channel
assignment solutions. The relative differences and scalability trends
of these protocols are consistent with what one would expect in
imperative implementations. Second, the cross-layer protocol
outperforms the other protocols and exhibits the best overall
performance in terms of high throughput and low loss rate.

In our second experiment, we fix the channel selection proto- 
col to be cross-layer and vary the channel selection policies. We 
use a simulated network setup with 30 nodes. Figure 7 highlights 
the capabilities of Cologne to handle policy variations with minor 
changes to the input Colog policy rules. Specifically, we vary the 
policies in two ways. First, Restricted Channels reduces the num- 
ber of available channels for each node by an average of 20%. This 
emulates the situation where some channels are no longer available 
due to external factors, e.g. decreased signal strength, the presence 
of primary users, or geographical spectrum usage limits. Second, 
1-hop Interference uses a different cost assignment function to con- 
sider only one-hop interference [28]. As a basis of comparison, 2- 
hop Interference shows our original channel selection policy used 
in prior experiments. We observe that for Restricted Channels, the 
throughput decreases by 35.9%. With the additional use of the one-hop
interference model, the throughput decreases further by an average of
6.9%, indicating that the two-hop interference model does a better
job of ensuring channel diversity.

Compilation and runtime overhead. The compilation time of
Centralized and Distributed is 1.2 seconds and 1.6 seconds
respectively. In terms of convergence time, Centralized requires less
than 30 seconds to perform channel selection. The execution time is
dominated by the computation overhead of the Gecode solver. The
distributed protocols converge quickly as well, at 40 seconds and
80 seconds respectively for Distributed and Cross-layer. Since the
solver computation only requires input channel information within
a node's neighborhood, each per-link COP computation during
negotiation is highly efficient and completes within 0.2 seconds. For
bandwidth utilization, Distributed and Cross-layer are both bandwidth
efficient, requiring only per-node average bandwidth utilization of
1.57KBps, 1.5SKBps respectively for computing channel selec- 
tion from scratch. In all cases, memory footprint is modest (about 
12MB). 



7. RELATED WORK 

In our prior work, we made initial attempts at developing specialized
optimization platforms tailored towards centralized cloud resource
orchestration [16] and wireless network configuration [14].
This paper generalizes ideas from these early experiences to develop
a general framework, a declarative programming language,
and corresponding compilation techniques. Consequently, Cologne
is targeted as a general-purpose distributed constraint optimization
platform that can support the original use cases and more. In doing
so, we have also enhanced the ACloud and Follow-the-Sun policies
through the use of Colog.

Prior to Cologne, there have been a variety of systems that use
declarative logic-based policy languages to express constraint
optimization problems in resource management of distributed
computing systems. [27] proposes continuous optimization based on
declaratively specified policies for autonomic computing. [24]
describes a model for automated policy-based construction as a goal
satisfaction problem in utility computing environments. The XSB
engine [6] integrates a tabled Prolog engine with a constraint solver.
Rhizoma [29] proposes using a rule-based language and constraint
programming to optimize resource allocation. [21] uses
a logic-based interface to a SAT solver to automatically generate
configuration solutions for a single data center. [10] describes
compiling and executing centralized declarative modeling languages as
Gecode programs.

Unlike the above systems, Cologne is designed to be a general
declarative distributed platform for constraint optimizations. It
first provides a general declarative policy language, Colog, which
is user-friendly for constraint solving modeling and results in
orders of magnitude code size reduction compared to imperative
alternatives. Another unique feature of Cologne is its support for
distributed optimizations, achieved by using the Colog language,
which supports distribution, and by integrating a distributed
query engine with a constraint solver. The Cologne platform supports
both simulation and deployment modes. This enables one to first
simulate distributed COP execution within a controllable network
environment and then physically deploy the system on real devices.

8. CONCLUSION 

In this paper, we have presented the Cologne platform for declarative
constraint optimizations in distributed systems. We argue that
such a platform has tremendous practical value in facilitating
extensible distributed systems optimizations. We discuss two concrete
use cases based on cloud resource orchestration and wireless
network configurations, and demonstrate how the Colog language
enables a wide range of policies to be customized in support of
these two scenarios. We have proposed novel query processing
functionalities that extend basic distributed Datalog's PSN [18]
evaluation strategies with solver modules, and compilation techniques
that rewrite rule selection and aggregation expressions into
solver constraints. We have implemented a complete prototype
based on the RapidNet declarative networking engine and the Gecode
constraint solver. Our evaluation results demonstrate the feasibility of
Cologne, both in terms of the wide range of policies supported and
the efficiency of the platform itself.

As future work, we plan to explore additional use cases in a wide 
range of emerging domains that involve distributed COPs, includ- 
ing decentralized data analysis and model fitting, resource alloca- 
tions in other distributed systems, network design and optimiza- 
tions, etc. 
APPENDIX

A. DECLARATIVE CHANNEL SELECTION 

We first formulate wireless channel selection as a constraint op- 
timization problem (COP), followed by presenting its equivalent 
Colog programs (both centralized and distributed). 

A.1 COP Formulation 

In wireless channel selection, the optimization variables are the
channels to be assigned to each communication link, while the values
are chosen from candidate channels available to each node. The
goal in this case is to minimize the likelihood of interference among
conflicting links, which maps into the well-known graph-coloring
problem [13].

We consider the following example that avoids interference based 
on the one-hop interference model [28]. In this model, any two 
adjacent links are considered to interfere with each other if they 
both use channels whose frequency bands are closer than a certain 
threshold. The formulation is as follows: 

Input domain and variables: Consider a network $G = (V, E)$,
where there are nodes $V = \{1, 2, \ldots, N\}$ and edges $E \subseteq V \times V$.
Each node $x$ has a set of channels $P_x$ currently occupied by primary
users within its vicinity. The number of interfaces of each node is $i_x$.

Optimization goal: For any two adjacent nodes $x, y \in V$, $l_{xy}$
denotes the link between $x$ and $y$. Channel assignment selects a
channel $c_{xy}$ for each link $l_{xy}$ to meet the following optimization
goal:

$$\min \sum_{l_{xy},\, l_{xz} \in E} cost(c_{xy}, c_{xz}) \qquad (7)$$

where $cost(c_{xy}, c_{xz})$ assigns a unit penalty if adjacent channel
assignments $c_{xy}$ and $c_{xz}$ are separated by less than a specified
frequency threshold $F_{mindiff}$.
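
One way to write the unit penalty explicitly is as an indicator function. The piecewise form below is a sketch derived from the prose description and from the cost derivation rule in Section A.2; the notation is an assumption rather than a formula taken from the formulation itself:

```latex
cost(c_{xy}, c_{xz}) =
\begin{cases}
  1 & \text{if } |c_{xy} - c_{xz}| < F_{mindiff}, \\
  0 & \text{otherwise.}
\end{cases}
```

Under this reading, the objective counts the pairs of adjacent links whose channels are too close in frequency.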

Constraints: The optimization goal has to be achieved under the 
following three constraints: 

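$$\forall l_{xy} \in E,\; c_{xy} \notin P_x \qquad (9)$$
$$\forall l_{xy} \in E,\; c_{xy} = c_{yx} \qquad (10)$$
$$\forall x \in V,\; \Big|\bigcup_{y} c_{xy}\Big| \le i_x \qquad (11)$$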

(9) expresses the constraint that a node should not use channels
currently occupied by primary users within its vicinity. (10) requires
two adjacent nodes to communicate with each other using
the same channel. (11) guarantees that the number of channels assigned
to a node is no more than its number of radio interfaces.
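
To make the formulation concrete, the following is a minimal Python sketch of the COP as an exhaustive search over a small network. It is not Colog and not how Cologne or Gecode solve the problem; the function names (`one_hop_cost`, `feasible`, `solve`) and the dictionary-based data layout are hypothetical choices made for illustration. The cost counts ordered pairs of adjacent links, so each interfering pair contributes twice.

```python
from itertools import product

def one_hop_cost(assign, f_mindiff):
    """Objective (7): count ordered pairs of links (x,y), (x,z) with y != z
    whose channels are separated by less than f_mindiff."""
    cost = 0
    for (x, y), c1 in assign.items():
        for (x2, z), c2 in assign.items():
            if x == x2 and y != z and abs(c1 - c2) < f_mindiff:
                cost += 1
    return cost

def feasible(assign, primary, interfaces):
    """Check constraints (9)-(11) on a complete assignment."""
    for (x, y), c in assign.items():
        if c in primary.get(x, set()):   # (9) avoid primary-user channels
            return False
        if assign.get((y, x)) != c:      # (10) both ends use the same channel
            return False
    for x, limit in interfaces.items():  # (11) distinct channels <= interfaces
        used = {c for (a, _), c in assign.items() if a == x}
        if len(used) > limit:
            return False
    return True

def solve(edges, channels, primary, interfaces, f_mindiff):
    """Brute-force search; only practical for very small networks."""
    best, best_cost = None, None
    for combo in product(channels, repeat=len(edges)):
        assign = {}
        for (x, y), c in zip(edges, combo):
            assign[(x, y)] = c
            assign[(y, x)] = c
        if not feasible(assign, primary, interfaces):
            continue
        cost = one_hop_cost(assign, f_mindiff)
        if best_cost is None or cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost

# Example: a three-node chain 1-2-3 where channel 1 is occupied near node 1.
best, best_cost = solve(
    edges=[(1, 2), (2, 3)],
    channels=[1, 6, 11],
    primary={1: {1}},
    interfaces={1: 1, 2: 2, 3: 1},
    f_mindiff=5,
)
```

The search space grows exponentially with the number of links, which is one reason a constraint solver with pruning is used in practice.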

A.2 Centralized Channel Selection 

In centralized channel selection [23, 8], a channel manager is 
deployed on a single node in the network. Typically, this node 
is a designated server node, or is chosen among peers via a sepa- 
rate leader election protocol. The centralized manager collects the 
network status information from each node in the network - this 
includes their neighborhood information, available channels, and 
any additional local policies. The following Colog program takes 
as input the link table, which stores the gathered network topology 
information, and specifies the one-hop interference model COP
formulation described in Section A.1.

goal minimize C in totalCost(C)
var assign(X,Y,C) forall link(X,Y)

// cost derivation rules
d1 cost(X,Y,Z,C) <- assign(X,Y,C1), assign(X,Z,C2),
        Y!=Z, (C==1)==(|C1-C2|<F_mindiff).
d2 totalCost(SUM<C>) <- cost(X,Y,Z,C).

// primary user constraint
c1 assign(X,Y,C) -> primaryUser(X,C2), C!=C2.

// channel symmetry constraint
c2 assign(X,Y,C) -> assign(Y,X,C).

// interface constraint
d3 uniqueChannel(X,UNIQUE<C>) <- assign(X,Y,C).
c3 uniqueChannel(X,Count) -> numInterface(X,K), Count<=K.

Optimization goal and variables: The goal in this case is to
minimize the cost attribute C in totalCost, while assigning channel
variables assign for all communication links. Each entry of the
assign(X,Y,C) table indicates that channel C is used for communication
between X and Y.

Solver derivations: Rule d1 sets cost C to 1 for each
cost(X,Y,Z,C) tuple if the chosen channels that X uses to communicate
with adjacent nodes Y and Z are interfering. Rule d2 sums the
number of interfering channels among adjacent links in the entire
network, and stores the result in totalCost.

Solver constraints: The constraints c1-c3 encode the three constraints
introduced in the COP formulation in Section A.1.

In some wireless deployments, e.g. IEEE 802.11, the two-hop 
interference model [28] is often considered a more accurate mea- 
surement of interference. This model considers interference that 
results from any two links using similar channels within two hops 
of each other. The two-hop interference model requires minor
modifications to rule d1 as follows:



d3 cost(X,Y,Z,W,C) <- assign(X,Y,C1), link(Z,X),
        assign(Z,W,C2), X!=W, Y!=W, Y!=Z,
        (C==1)==(|C1-C2|<F_mindiff).

The above rule considers four adjacent nodes W, Z, X, and Y,
and assigns a cost of 1 to node X's channel assignment with Y
(assign(X,Y,C1)), if there exists a neighbor Z of X that is currently
using channel C2 that interferes with C1 to communicate with another
node W. The above policy requires only adding one additional
link(Z,X) predicate to the original rule d1, demonstrating the
customizability of Colog. Together with rule d1, one can assign costs
to both one-hop and two-hop interference models.
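
The two-hop rule can be read as a join over the assign and link tables followed by a count. The Python sketch below mirrors that reading for illustration only; `two_hop_cost` and its data layout (a dictionary of directed link assignments and a set of directed links) are hypothetical and are not part of Colog or Cologne.

```python
def two_hop_cost(assign, links, f_mindiff):
    """Count tuples (X, Y, Z, W) matched by the two-hop rule: Z is a neighbor
    of X (link(Z, X)), and the channel on link X-Y is separated from the
    channel on link Z-W by less than f_mindiff."""
    cost = 0
    for (x, y), c1 in assign.items():
        for (z, w), c2 in assign.items():
            if (z, x) in links and x != w and y != w and y != z:
                if abs(c1 - c2) < f_mindiff:
                    cost += 1
    return cost

# Example: a four-node chain 1-2-3-4 with both link directions listed.
links = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)}
assign = {(1, 2): 1, (2, 1): 1, (2, 3): 6, (3, 2): 6, (3, 4): 1, (4, 3): 1}
cost_same = two_hop_cost(assign, links, f_mindiff=5)

# Moving link 3-4 to a distant channel removes the two-hop conflict.
assign_far = dict(assign)
assign_far[(3, 4)] = 11
assign_far[(4, 3)] = 11
cost_far = two_hop_cost(assign_far, links, f_mindiff=5)
```

In this example, links 1-2 and 3-4 do not share a node, so a one-hop cost would not penalize their use of the same channel, whereas the two-hop count does.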

A.3 Distributed Channel Selection 

We next demonstrate Cologne's ability to implement distributed
channel selection. Our example here is based on a variant of the
distributed greedy protocol proposed in [25]. This example highlights
Cologne's ability to support distributed COP computations, where
nodes compute channel assignments based on local neighborhood
information, and then exchange channel assignments with neighbors
to perform further COP computations.

The protocol works as follows. Periodically, each node randomly
selects one of its links to start a link negotiation process with its
neighbor. This is similar to the distributed Colog program for
Follow-the-Sun in Section 4.3. Once a link is selected for channel
assignment, the result of link negotiation is stored in table setLink(X,Y).
The negotiation process then solves a local COP and assigns a
channel such that interference is minimized. The following Colog
program implements the local COP operation at every node X for
performing channel assignment. The output of the program sets
the channel assign(X,Y,C) for one of its links link(X,Y) (chosen
for the current channel negotiation process) based on the two-hop
interference model:

goal minimize C in totalCost(@X,C)
var assign(@X,Y,C) forall setLink(@X,Y)

// cost derivation for two-hop interference model
d1 cost(@X,Y,Z,W,C) <- assign(@X,Y,C1), link(@Z,X),
        assign(@Z,W,C2), X!=W, Y!=W, Y!=Z,
        (C==1)==(|C1-C2|<F_mindiff).
d2 totalCost(@X,SUM<C>) <- cost(@X,Y,Z,W,C).

// primary user constraint
c1 assign(@X,Y,C) -> primaryUser(@X,C2), C!=C2.
c2 assign(@X,Y,C) -> primaryUser(@Y,C2), C!=C2.

// propagate channels to ensure symmetry
r1 assign(@Y,X,C) <- assign(@X,Y,C).

The distributed program is similar to the centralized equivalent 
presented in Section A.2, with the following differences: 

While the centralized channel selection searches over all combinations
of channel assignments for all links, the distributed equivalent
restricts channel selection to a single link at a time, where the
selected link is represented by setLink(@X,Y) based on the negotiation
process. For this particular link, the COP execution takes as
input its local neighbor set (link) and all currently assigned channels
(assign) for itself and nodes in the local neighborhood. This
means that the COP execution is an approximation based on local
information gathered from a node's neighborhood.

Specifically, distributed solver rule d1 enables node X to collect
the current set of channel assignments for its immediate neighbors
and derive the cost based on the two-hop interference model. In
executing the channel selection for the current link, constraints c1
and c2 express that the channel assignment for link(@X,Y) does not equal
any channel used by primaryUser at either endpoint. Once a channel is
set at node X after COP execution, the channel-to-link assignment is then
propagated to neighbor Y, hence resulting in symmetric channel
assignments (rule r1).
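
The per-link local COP can be sketched in Python as follows. This is an illustrative approximation under stated assumptions, not the Cologne runtime: `negotiate_link` is a hypothetical name, the neighborhood state is assumed to be already available locally as dictionaries, and a linear scan over candidate channels stands in for the constraint solver.

```python
def negotiate_link(x, y, channels, primary, assign, links, f_mindiff):
    """Pick a channel for the selected link (x, y) that minimizes two-hop
    interference with assignments already known in x's neighborhood."""
    best_c, best_cost = None, None
    for c in channels:
        # Constraints c1 and c2: skip channels held by primary users
        # at either endpoint of the link.
        if c in primary.get(x, set()) or c in primary.get(y, set()):
            continue
        # Rules d1 and d2: count neighbor links (z, w) that would interfere.
        cost = 0
        for (z, w), c2 in assign.items():
            if (z, x) in links and x != w and y != w and y != z:
                if abs(c - c2) < f_mindiff:
                    cost += 1
        if best_cost is None or cost < best_cost:
            best_c, best_cost = c, cost
    if best_c is not None:
        assign[(x, y)] = best_c
        assign[(y, x)] = best_c  # rule r1: propagate for symmetry
    return best_c, best_cost

# Example: chain 1-2-3-4 where link 3-4 already uses channel 1 and
# channel 6 is occupied by a primary user near node 1.
links = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)}
assign = {(3, 4): 1, (4, 3): 1}
result = negotiate_link(2, 1, [1, 6, 11], {1: {6}}, assign, links, f_mindiff=5)
```

Because each call considers only one link and the current neighborhood state, the result is a greedy approximation; repeated negotiation rounds across nodes are what refine the overall assignment.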